PCB defect detection algorithm based on CDI-YOLO

During the manufacturing of printed circuit boards (PCBs), quality defects can occur that affect the performance and reliability of the boards. Existing deep learning-based PCB defect detection methods struggle to achieve high detection accuracy, fast detection speed, and a small number of parameters at the same time. Therefore, this paper proposes a PCB defect detection algorithm based on CDI-YOLO. Firstly, the coordinate attention (CA) mechanism is introduced into the backbone and neck networks of YOLOv7-tiny to enhance the feature extraction capability of the model and thus improve detection accuracy. Secondly, DSConv replaces part of the ordinary convolutions in YOLOv7-tiny to lower the computational cost and increase detection speed. Finally, Inner-CIoU is used as the bounding box regression loss function of CDI-YOLO to speed up bounding box regression. Experimental results show that the method achieves 98.3% mAP on the PCB defect dataset at a detection speed of 128 frames per second (FPS), with 5.8 M parameters and a computational cost of 12.6 GFLOPs (giga floating-point operations). Compared with existing methods, the proposed method offers advantages in overall performance.


Introduction of coordinate attention module
Because PCB defects are small, their feature information may not be immediately apparent and can be influenced by various environmental factors. To enhance the ability of the YOLOv7-tiny network model to extract defect information from PCBs, this study incorporates the CA module into the ELAN module of YOLOv7-tiny 23. This improves the accuracy of PCB defect detection.
The CA module is an attention mechanism used in computer vision tasks to improve model performance by enhancing feature representation. Traditional attention mechanisms focus on the channel dimension of the feature map, dynamically adjusting the importance of features across channels by learning weights. In contrast, the CA module concentrates on the spatial locations of the feature map and adjusts the importance of different locations by learning their weights. Its fundamental idea is to incorporate the location information of the feature map into the attention weights, on the premise that features located in different areas may contribute differently to the task. By learning location weights, the module captures spatially structured information more effectively. The CA module encodes channel relationships and long-range dependencies in two steps: coordinate information embedding and coordinate attention generation. Figure 3 illustrates the coordinate attention module, and its principle is described in detail below.
(1) Coordinate Information Embedding. The input feature map $X$ is pooled along the horizontal and vertical directions using two pooling kernels, $(H, 1)$ and $(1, W)$, respectively. Equation (1) gives the output of channel $c$ at height $h$, and Eq. (2) gives the output of channel $c$ at width $w$.
The horizontal and vertical outputs are then concatenated to obtain a pair of direction-aware feature maps $Z$.
(2) Coordinate Attention Generation. As shown in Eq. (3), the feature map $Z$ obtained through coordinate information embedding is passed through a 1 × 1 convolution $F_1$, followed by a nonlinear activation $\delta$.
The feature map $f$ obtained by Eq. (3) is split into two tensors, $f^h \in \mathbb{R}^{C/r \times H}$ and $f^w \in \mathbb{R}^{C/r \times W}$, along the horizontal and vertical directions, respectively. Two 1 × 1 convolutions, $F_h$ and $F_w$, are then used to convert $f^h$ and $f^w$ into tensors $g^h$ and $g^w$, respectively, with the same number of channels as the input $X$. These are computed as shown in Eqs. (4) and (5).
where $\sigma$ is the sigmoid activation function. The outputs $g^h$ and $g^w$ from Eqs. (4) and (5) are then applied as multiplicative weights to the original input feature map $X$. Finally, the output $Y$ of the coordinate attention module is given by Eq. (6).
Through these steps, the CA module adjusts feature weights according to the importance of each location, improving the model's ability to capture spatially structured information. This mechanism enhances the model's accuracy in perceiving and understanding important spatial locations in computer vision tasks.
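For reference, the two steps above can be written out as follows. This is a sketch based on the standard coordinate attention formulation, with $x_c(i, j)$ denoting the value of channel $c$ at row $i$ and column $j$, $[\cdot, \cdot]$ denoting concatenation along the spatial dimension, and $r$ denoting the channel reduction ratio. The exact index notation is an assumption and may differ from the notation used in Eqs. (1) to (6).

```latex
\begin{align}
z_c^h(h) &= \frac{1}{W} \sum_{0 \le i < W} x_c(h, i) \tag{1} \\
z_c^w(w) &= \frac{1}{H} \sum_{0 \le j < H} x_c(j, w) \tag{2} \\
f &= \delta\left(F_1\left(\left[z^h, z^w\right]\right)\right) \tag{3} \\
g^h &= \sigma\left(F_h\left(f^h\right)\right) \tag{4} \\
g^w &= \sigma\left(F_w\left(f^w\right)\right) \tag{5} \\
y_c(i, j) &= x_c(i, j) \times g_c^h(i) \times g_c^w(j) \tag{6}
\end{align}
```

Because $g^h$ depends only on the row index and $g^w$ only on the column index, each output position is reweighted by the product of one row weight and one column weight.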

DSConv module
DSConv 24 is a variant of the traditional convolutional layer. Replacing ordinary convolution with DSConv lowers the computational cost and raises detection speed. The principle of DSConv is shown in Fig. 4.
DSConv decomposes the operation of a traditional convolutional layer into two components: the Variable Quantized Kernel (VQK) and distribution shifts. The VQK is the quantised component of DSConv and has the same size $(ch_o, ch_i, k, k)$ as the original convolution tensor, where $ch_o$ denotes the number of output channels, $ch_i$ denotes the number of input channels, and $k$ denotes the size of the convolution kernel. Its parameter values are obtained by quantising the original floating-point model into variable bit-length integer values; once quantised, they cannot be changed. Distribution shifts adjust the distribution of the VQK through two tensors: the Kernel Distribution Shifter (KDS) and the Channel Distribution Shifter (CDS). The KDS shifts the distribution of each $(1, BLK, 1, 1)$ slice of the VQK. BLK is a hyperparameter that determines the block size of the VQK depth values covered by each shift operation, so each value in the KDS corresponds to a shift applied to BLK depth values of the VQK. The size of the KDS is $2 \cdot (ch_o, \mathrm{CEIL}(ch_i / BLK), k, k)$, where CEIL(x) is the round-up (ceiling) operator, which ensures that the blocks cover all input channels even when $ch_i$ is not a multiple of BLK. The size of the CDS is $2 \cdot (ch_o)$. The CDS shifts the distribution of each channel, operating on each $(1, ch_i, k, k)$ slice.
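The tensor sizes described above can be made concrete with a small sketch. It only computes the shapes of the VQK, KDS, and CDS for one layer and does not implement the convolution itself. The function names `dsconv_shapes` and `num_elements` and the example layer (64 input channels, 128 output channels, a 3 × 3 kernel, BLK = 32) are hypothetical and not taken from the paper, and representing the leading factor of 2 as an extra first dimension is an interpretation.

```python
import math


def dsconv_shapes(ch_o, ch_i, k, blk):
    """Return the shapes of the DSConv tensors for one convolutional layer.

    ch_o: number of output channels
    ch_i: number of input channels
    k:    convolution kernel size
    blk:  block size hyperparameter (BLK)
    """
    # Number of depth blocks; the ceiling covers a partial last block.
    blocks = math.ceil(ch_i / blk)
    return {
        # Quantised kernel, same size as the original convolution tensor.
        "VQK": (ch_o, ch_i, k, k),
        # Assumption: the factor 2 is modelled as a leading dimension.
        "KDS": (2, ch_o, blocks, k, k),
        "CDS": (2, ch_o),
    }


def num_elements(shape):
    """Total number of values stored in a tensor of the given shape."""
    return math.prod(shape)


# Hypothetical example layer, for illustration only.
shapes = dsconv_shapes(ch_o=128, ch_i=64, k=3, blk=32)
for name, shape in shapes.items():
    print(name, shape, num_elements(shape))
```

In this example the KDS and CDS together hold far fewer values than the VQK, which illustrates why the shift tensors add little overhead on top of the quantised kernel.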

Inner-CIoU loss
The YOLOv7-tiny model employs the CIoU bounding box regression loss function. However, this function has the disadvantage of slow convergence. To address this issue, we use the Inner-CIoU loss 25 as the bounding box regression loss function.
The Inner-CIoU loss builds on the CIoU loss by computing an additional IoU term on auxiliary bounding boxes, and is defined as shown in Eq. (7), where $L_{CIoU}$ denotes the CIoU loss function, IoU denotes the intersection over union of the predicted and ground-truth boxes, and $IoU^{inner}$ is defined as shown in Eq. (8).
The definitions of inter and union are shown in Eqs. (9) and (10).

Experimental environment
The operating system used in this experiment is Windows 11 64-bit operating system, the CPU is Intel(R) Core(TM) i5-13400F @ 2.60 GHz, the GPU is NVIDIA GeForce RTX 3060 with 12 GB of video memory, the

Evaluation metrics
We use mean Average Precision (mAP), Parameters, GFLOPs, and FPS as evaluation metrics. mAP is the average of the AP values over the different PCB defect types, as defined in Eq. (15).
where N denotes the number of PCB defect types and AP is the area under the precision-recall (PR) curve, calculated as shown in Eq. (16).
P is the precision, the proportion of predicted positive samples that are correctly classified, calculated as shown in Eq. (17).
where TP denotes the number of samples predicted to be positive that are actually positive, and FP denotes the number of samples predicted to be positive that are actually negative. R is the recall, the proportion of all positive samples that are correctly classified, calculated as shown in Eq. (18).
where FN denotes the number of samples predicted to be negative that are actually positive.
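The metric definitions above can be illustrated with a minimal sketch, assuming the usual formulas P = TP / (TP + FP), R = TP / (TP + FN), and mAP as the mean of the per-class AP values. The function names and the example counts are hypothetical. The trapezoidal approximation of the area under the PR curve is a simplification chosen for this sketch; evaluation toolkits typically use an interpolated variant.

```python
def precision(tp, fp):
    """P = TP / (TP + FP); returns 0.0 when nothing is predicted positive."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


def recall(tp, fn):
    """R = TP / (TP + FN); returns 0.0 when there are no positive samples."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


def average_precision(recalls, precisions):
    """Approximate the area under the PR curve with the trapezoidal rule.

    Assumption: a plain trapezoidal sum over the given points, not the
    interpolated AP used by common detection benchmarks.
    """
    points = sorted(zip(recalls, precisions))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


def mean_average_precision(ap_values):
    """mAP = (1 / N) * sum of AP over the N defect types."""
    return sum(ap_values) / len(ap_values)


# Hypothetical counts for one defect type, for illustration only.
p = precision(tp=90, fp=10)
r = recall(tp=90, fn=30)
print(p, r)
```

A model that predicts few boxes tends to score high on precision and low on recall, which is why AP summarises the whole PR curve rather than a single operating point.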

Ratio setting of inner-CIoU loss
To determine an appropriate ratio, we conducted experiments with the CDI-YOLO network model on the PCB Defect dataset, setting the ratio of Inner-CIoU to 0.6, 0.7, 0.8, 0.9, and 1.1 in turn. Table 2 shows the results of these comparative experiments.
As Table 2 shows, a ratio of 0.7 yields the best values of P, R, mAP50, and mAP50:95. We therefore set the ratio of Inner-CIoU to 0.7.

Ablation experiment
To evaluate the impact of the CA module, DSConv, and the Inner-CIoU loss function on the performance of the YOLOv7-tiny network model, we performed comparative experiments on the PCB Defect dataset. Table 3 shows the results of the ablation experiments. In the table, Model_1 denotes the baseline YOLOv7-tiny model, Model_2 the introduction of the CA module, Model_3 the introduction of DSConv, Model_4 the introduction of the Inner-CIoU loss function, Model_5 the simultaneous introduction of the CA module and DSConv, Model_6 the simultaneous introduction of the CA module and the Inner-CIoU loss function, Model_7 the simultaneous introduction of DSConv and the Inner-CIoU loss function, and Model_8 the full CDI-YOLO network model. These ablation experiments allow us to evaluate the contribution of each module to model performance.
The results of Model_2 show that introducing the CA module improves the model's ability to extract features of PCB defects and raises mAP50 to 96.4%, an improvement of 1% over YOLOv7-tiny. However, this gain is accompanied by a small increase in the number of model parameters and a decrease in FPS. The CA module introduces coordinate encoding parameters to represent position information, which increases the parameter count. In addition, it operates on every position in the feature map, which raises the computational complexity, lengthens each forward pass, and therefore lowers the FPS. The results of Model_3 show that after replacing some of the ordinary convolutions of YOLOv7-tiny with DSConv, the GFLOPs of the model decrease by 4.5% while the FPS increases by 6.1%. This shows that DSConv reduces the computational complexity of the model and increases detection speed. The results of Model_4 show that with Inner-CIoU as the loss function of YOLOv7-tiny, the model converges faster, bounding box prediction is more accurate, and both detection accuracy and speed improve, with mAP50 reaching 97.0%. In Model_5, which introduces both the CA module and DSConv, DSConv compensates to some extent for the decrease in FPS caused by the CA module and reduces the model's GFLOPs. The results of Model_6 show that the combination of the CA module and the Inner-CIoU loss function improves mAP50 by 2.2%. Similarly, the results of Model_7 show that the combination of DSConv and the Inner-CIoU loss function improves mAP50 by 2.0% and FPS by 6.9%. Finally, with all three modules introduced simultaneously, Model_8 reaches an mAP50 of 98.3%, the best among all the models. Model_8 also achieves the highest AP50 for five of the six PCB defect types; only for missing hole is its AP50 lower than that of Model_5.
In summary, compared to YOLOv7-tiny, CDI-YOLO performs well in all performance metrics except for a slight increase in the number of parameters and lower FPS.

Comparison experiment
To validate the advantages of the CDI-YOLO network model, we compared its performance with existing mainstream methods (YOLOv3, YOLOv3-SPP, YOLOv4, YOLOR, YOLOv5s, YOLOv7, and YOLOv7-tiny) on the PCB Defect dataset. The results of the comparison experiments are shown in Table 4.
Table 4 compares the performance of the different models in terms of P, R, mAP50, mAP50:95, Parameters, GFLOPs, and FPS. CDI-YOLO achieves the best results in terms of P, R, mAP50, mAP50:95, and GFLOPs. Only in terms of Parameters and FPS are its results slightly worse than those of YOLOv7-tiny.
In general, compared with the existing mainstream methods, CDI-YOLO is slightly inferior to YOLOv7-tiny in the number of parameters and detection speed, but the gap is small. At the same time, CDI-YOLO shows higher detection accuracy on the PCB Defect dataset. This indicates that CDI-YOLO addresses the difficulty existing methods have in combining high detection accuracy, fast detection, and few parameters, making it a suitable choice for real-time detection on hardware devices.
In the detection results of row 1, the YOLOv7-tiny algorithm assigns a lower confidence level to its detections than the CDI-YOLO algorithm. In row 2, the YOLOv7-tiny algorithm produces a false detection. In rows 3, 4, 5, and 6, the YOLOv7-tiny algorithm misses defects. Taken together, the detection results of the CDI-YOLO algorithm are better than those of the YOLOv7-tiny algorithm.

Conclusion
This paper proposes a PCB defect detection algorithm based on CDI-YOLO. The algorithm introduces a CA module into the YOLOv7-tiny object detection algorithm to make better use of spatial information, thereby improving the detection of defects at different positions on the PCB. Additionally, selected ordinary convolutional layers are replaced with DSConv to reduce model complexity and improve detection speed. Furthermore, Inner-CIoU is employed as the bounding box regression loss function, using auxiliary bounding boxes to speed up bounding box regression. Experimental results demonstrate that CDI-YOLO achieves the highest detection accuracy among the compared methods, with an mAP of 98.3%. CDI-YOLO has 5.8 M parameters, slightly more than YOLOv7-tiny but with a negligible difference. Its detection speed of 128 FPS is slightly lower than that of YOLOv7-tiny but still meets real-time detection requirements. The proposed method therefore addresses the combined challenge of high detection accuracy, fast detection, and a small parameter count, providing a practical solution for PCB defect detection systems.
However, practical application scenarios involve various interfering factors, such as complex backgrounds, lighting variations, and noise, which can affect the model's detection accuracy. To improve detection accuracy in real-world scenarios, we plan to augment our dataset of PCB defect samples with more instances that contain complex backgrounds and to train the model on them to enhance its robustness. Additionally, annotating a large number of defect samples for PCB defect detection is time-consuming and expensive. Future research can explore weakly supervised learning methods, such as weak labeling, the use of unlabeled data, and semi-supervised learning, to improve the effectiveness of defect detection. This approach will help reduce the demand for large amounts of annotated data, thereby lowering costs and improving detection performance.
$b_t$ are defined as follows. As shown in Fig. 5, $x_c$ and $y_c$ denote the coordinates of the centroid of the prediction box, $x_c^{gt}$ and $y_c^{gt}$ denote the coordinates of the centroid of the true box, $w$ and $h$ denote the width and height of the prediction box, $w^{gt}$ and $h^{gt}$ denote the width and height of the true box, and ratio denotes the scaling factor for generating the auxiliary bounding box, which generally takes values in the range [0.5, 1.5].
$$\mathrm{inter} = \left(\min\left(b_r^{gt}, b_r\right) - \max\left(b_l^{gt}, b_l\right)\right) \cdot \left(\min\left(b_b^{gt}, b_b\right) - \max\left(b_t^{gt}, b_t\right)\right) \tag{9}$$
$$\mathrm{union} = \left(w^{gt} \cdot h^{gt}\right) \cdot (\mathrm{ratio})^2 + (w \cdot h) \cdot (\mathrm{ratio})^2 - \mathrm{inter} \tag{10}$$
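The inter and union definitions above can be sketched in a few lines of Python. Several parts are assumptions rather than the paper's own equations: the auxiliary box boundaries are taken to be the box centre plus or minus half of the scaled width or height, the intersection is clamped to zero for non-overlapping boxes (as implementations commonly do), and the final loss is combined as $L_{CIoU} + IoU - IoU^{inner}$. The function names are hypothetical, and the default ratio of 0.7 simply mirrors the value selected in the ratio experiments.

```python
def inner_iou(pred, gt, ratio=0.7):
    """IoU of the auxiliary (scaled) boxes of a predicted and a true box.

    pred, gt: tuples (x_c, y_c, w, h) giving centre, width and height.
    ratio:    scaling factor used to generate the auxiliary boxes.
    """
    x_c, y_c, w, h = pred
    x_gt, y_gt, w_gt, h_gt = gt

    # Assumed boundary definition: centre +/- half of the scaled side.
    b_l, b_r = x_c - w * ratio / 2, x_c + w * ratio / 2
    b_t, b_b = y_c - h * ratio / 2, y_c + h * ratio / 2
    bg_l, bg_r = x_gt - w_gt * ratio / 2, x_gt + w_gt * ratio / 2
    bg_t, bg_b = y_gt - h_gt * ratio / 2, y_gt + h_gt * ratio / 2

    # Eq. (9), with each side clamped at zero (an added safeguard).
    inter_w = max(0.0, min(bg_r, b_r) - max(bg_l, b_l))
    inter_h = max(0.0, min(bg_b, b_b) - max(bg_t, b_t))
    inter = inter_w * inter_h

    # Eq. (10).
    union = (w_gt * h_gt) * ratio ** 2 + (w * h) * ratio ** 2 - inter
    return inter / union if union > 0 else 0.0


def inner_ciou_loss(l_ciou, iou, iou_inner):
    """Assumed combination: L_CIoU + IoU - IoU_inner."""
    return l_ciou + iou - iou_inner


# Two partially overlapping boxes, for illustration only.
print(inner_iou((0, 0, 2, 2), (1, 0, 2, 2), ratio=1.0))
print(inner_iou((0, 0, 2, 2), (1, 0, 2, 2), ratio=0.5))
```

With a ratio below 1 the auxiliary boxes shrink around their centres, so boxes that overlap only at their edges can have an inner IoU of zero; this makes the extra term more sensitive to centre misalignment.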

Figure 7 shows a comparison of the detection results of the YOLOv7-tiny algorithm and the CDI-YOLO algorithm on six types of PCB defects. The first column shows the location of the defect with a red box, the second column shows the detection results of the YOLOv7-tiny algorithm, and the third column shows the detection results of the CDI-YOLO algorithm. Each row uses the same image, and the corresponding defect types are missing hole, mouse bite, open circuit, short, spur, and spurious copper.

Figure 7. Comparison of the detection results of YOLOv7-tiny and CDI-YOLO.

running memory is 16 GB, the programming language is Python 3.8, the deep learning framework is PyTorch 1.8.1, and the CUDA version is 11.1. During training, the input image size is 608 × 608, the batch size is 8, the number of training epochs is 200, the optimizer is Adam with a momentum of 0.937, the weight decay is 0.0005, the initial learning rate is 0.01, and the learning rate is decayed following a cosine function.

Dataset preprocessing
The experiments use the PCB Defect dataset released by the Intelligent Robot Open Laboratory of Peking University (http://robotics.pkusz.edu.cn/resources/dataset/). The dataset comprises 693 images of PCB defects, which were cropped to produce 10,668 images, each containing one of six types of defects: missing hole, mouse bite, open circuit, short, spur, and spurious copper. Table 1 shows the number of images for each defect type. The defects in the PCB images were labeled using the LabelImg tool and stored in the Pascal VOC dataset format. The dataset was then divided into a training set and a test set in an 8:2 ratio. Figure 6 displays images of defects in the PCB Defect dataset; the red boxed areas indicate the defective parts.

Table 1. Dataset of PCB defect.

Table 2. Performance comparison of Inner-CIoU with different ratios.

Table 3. Ablation experiment. Significant values are in bold.